Bandit and covariate processes, with finite or non-denumerable set of arms
Authors
Abstract
We introduce herein a new approach to nonparametric multi-armed bandit theory involving both the bandit and covariate processes. Following Berry et al. (1997), we assume a non-denumerable set of arms for the bandit process. The approach we develop can be readily extended to continuous-time processes by using ε-greedy randomization and arm elimination instead of dynamic allocation indices. It also carries out a stochastic search, with O(1) expected time, for nearly optimal arms at values in a given set B before applying arm elimination. The procedure is shown to attain asymptotically minimal rates of regret over B.
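As a rough illustration of the two ingredients named in the abstract, ε-greedy randomization and arm elimination, the following Python sketch runs them on a small finite set of arms. It is not the procedure of the paper: it ignores covariates, the stochastic search step, and the non-denumerable case, and every name and parameter in it (`epsilon`, `margin`, `min_pulls`, the `pull` callback) is a hypothetical choice made only for this example.

```python
import random


def epsilon_greedy_with_elimination(pull, n_arms, horizon,
                                    epsilon=0.1, margin=0.2,
                                    min_pulls=20, seed=0):
    """Toy epsilon-greedy sampler with a simple arm-elimination rule.

    pull(arm, rng) must return a numeric reward for the chosen arm.
    All tuning parameters are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    active = list(range(n_arms))   # arms not yet eliminated
    counts = [0] * n_arms          # number of pulls per arm
    means = [0.0] * n_arms         # running mean reward per arm

    for _ in range(horizon):
        if rng.random() < epsilon:
            # Exploration: pick an active arm uniformly at random.
            arm = rng.choice(active)
        else:
            # Exploitation: pick the active arm with the best estimate.
            arm = max(active, key=lambda a: means[a])

        reward = pull(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]

        # Elimination: once every active arm has enough samples, drop
        # arms whose estimate trails the current best by more than margin.
        if all(counts[a] >= min_pulls for a in active):
            best = max(means[a] for a in active)
            active = [a for a in active if means[a] >= best - margin]

    return active, means, counts


if __name__ == "__main__":
    # Three Bernoulli arms with assumed success probabilities.
    probs = [0.2, 0.8, 0.5]

    def bernoulli(arm, rng):
        return 1.0 if rng.random() < probs[arm] else 0.0

    print(epsilon_greedy_with_elimination(bernoulli, len(probs), 5000))
```

The fixed `margin` is a deliberately crude stand-in for a statistically calibrated elimination threshold; it is used here only to keep the sketch short.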
Similar resources
Denumerable Constrained Markov Decision Processes and Finite Approximations
The purpose of this paper is twofold. First, to establish the theory of discounted constrained Markov decision processes with countable state and action spaces and a general multi-chain structure. Second, to introduce finite approximation methods. We define the occupation measures and obtain properties of the set of all achievable occupation measures under the different admissible policies. We est...
Multi-armed Bandit Problems with Strategic Arms
We study a strategic version of the multi-armed bandit problem, where each arm is an individual strategic agent and we, the principal, pull one arm each round. When pulled, the arm receives some private reward v_a and can choose an amount x_a to pass on to the principal (keeping v_a − x_a for itself). All non-pulled arms get reward 0. Each strategic arm tries to maximize its own utility over the cour...
Denumerable Constrained Markov Decision Problems and Finite Approximations
The purpose of this paper is twofold. First, to establish the theory of discounted constrained Markov decision processes with countable state and action spaces and a general multi-chain structure. Second, to introduce finite approximation methods. We define the occupation measures and obtain properties of the set of all achievable occupation measures under the different admissible policies. We est...
On Non-denumerable Graphs
PROOF. We shall first prove that every complete graph of power ℵ₁ can be split up into the countable sum of trees. Let G be a complete graph of cardinal number ℵ₁. Let {x_α}, α < ω₁, be any well ordered set of power ℵ₁. We may assume that G is represented by a system of segments (x_α, x_β), α < β < ω₁. For any β < ω₁ arrange the set of all α < β into a sequence α_{β,n}, n = 1, 2, ..., and let G_n be...
Dynamic Pricing under Finite Space Demand Uncertainty: A Multi-Armed Bandit with Dependent Arms
We consider a dynamic pricing problem under unknown demand models. In this problem a seller offers prices to a stream of customers and observes either success or failure in each sale attempt. The underlying demand model is unknown to the seller and can take one of N possible forms. In this paper, we show that this problem can be formulated as a multi-armed bandit with dependent arms. We propose...
Journal
Journal title: Stochastic Processes and their Applications
Year: 2022
ISSN: 1879-209X, 0304-4149
DOI: https://doi.org/10.1016/j.spa.2022.03.010